
43 research outputs found

Genome Halving by Block Interchange

Author: Ouangraoua Aïda
Thomas Antoine
Varré Jean-Stéphane
Publication date: 06/07/2011

We address the problem of finding the minimal number of block interchanges (exchanges of two intervals) required to transform a duplicated linear genome into a tandem duplicated linear genome. We provide a formula for the distance as well as a polynomial-time algorithm for the sorting problem.
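
To make the operation concrete, here is a minimal sketch of a single block interchange on a linear genome stored as a Python list. The function names `block_interchange` and `is_tandem_duplicated` and the toy genome are assumptions made for illustration. The sketch shows only what one operation does; it does not implement the distance formula or the sorting algorithm from the paper.

```python
def block_interchange(genome, i, j, k, l):
    """Return a copy of genome with intervals genome[i:j] and genome[k:l] swapped.

    Hypothetical helper: the two intervals must not overlap (i <= j <= k <= l).
    """
    if not (0 <= i <= j <= k <= l <= len(genome)):
        raise ValueError("expected 0 <= i <= j <= k <= l <= len(genome)")
    return genome[:i] + genome[k:l] + genome[j:k] + genome[i:j] + genome[l:]


def is_tandem_duplicated(genome):
    """True if the genome is some sequence immediately followed by a copy of itself."""
    half, remainder = divmod(len(genome), 2)
    return remainder == 0 and genome[:half] == genome[half:]


# Toy duplicated genome: every gene occurs exactly twice.
duplicated = ["a", "a", "b", "c", "b", "c"]

# Swap the interval ["a", "b"] (positions 1..2) with the interval ["b"] (position 4).
result = block_interchange(duplicated, 1, 3, 4, 5)
print(result, is_tandem_duplicated(result))
```

In this toy case one block interchange is enough to reach the tandem duplicated genome a b c a b c. In general, finding the minimal number of such operations is the problem the paper addresses.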

arXiv.org e-Print Archive

HAL - Lille 3

INRIA a CCSD electronic archive server

Parallel Position Weight Matrices Algorithms

Author: Giraud Mathieu
Varré Jean-Stéphane
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

Position Weight Matrices (PWMs) are broadly used in computational biology. The basic problems, Scan and MultipleScan, aim to find all the occurrences of a given PWM or a set of PWMs in long sequences. Some other PWM tasks share a common NP-hard subproblem, ScoreDistribution. The existing algorithms rely on the enumeration of a large set of scores or words, and they are mostly not suitable for parallelization. We propose a new algorithm, BucketScoreDistribution, that is both very efficient and suitable for parallelization. We bound the error induced by this algorithm. We implemented a GPU prototype for Scan, MultipleScan and BucketScoreDistribution with the CUDA libraries, and report speedups larger than 10× for the different problems on several Nvidia cards.
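
As an illustration of the Scan problem, the following is a minimal sequential sketch: slide a window over the sequence, sum the per-column scores, and report every position whose score reaches a threshold. The matrix values, the threshold and the name `scan` are assumptions chosen for the example. This is a naive CPU baseline, not the GPU prototype or the BucketScoreDistribution algorithm described in the abstract.

```python
def scan(pwm, sequence, threshold):
    """Return (position, score) for every window scoring at least threshold.

    pwm is a list of columns; each column maps a nucleotide to its score.
    """
    width = len(pwm)
    hits = []
    for pos in range(len(sequence) - width + 1):
        # Score of the window is the sum of one entry per matrix column.
        score = sum(pwm[k][sequence[pos + k]] for k in range(width))
        if score >= threshold:
            hits.append((pos, score))
    return hits


# Illustrative 3-column matrix with made-up integer scores.
TOY_PWM = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": 0},
]

print(scan(TOY_PWM, "TACGACTA", threshold=4))
```

Each window is scored independently of the others, which is one reason this kind of scan lends itself to parallel execution.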

HAL - Lille 3

INRIA a CCSD electronic archive server

Genome Halving by Block Interchange

Author: Thomas Antoine
Ouangraoua Aïda
Varré Jean-Stéphane
Publication venue: 'Scitepress'
Publication date: 01/02/2012

We address the problem of finding the minimal number of block interchanges (exchanges of two intervals) required to transform a duplicated linear genome into a tandem duplicated linear genome. We provide a formula for the distance as well as a polynomial-time algorithm for the sorting problem.

HAL - Lille 3

RMIT Research Repository

Manycore high-performance computing in bioinformatics

Author: Giraud Mathieu
Janot Stéphane
Schmidt Bertil
Varré Jean-Stéphane
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/2011

Mining the increasing amount of genomic data requires very efficient tools. Efficiency can be improved with better algorithms, but one can also take advantage of the hardware itself to reduce application runtimes. For several years, heat dissipation issues have prevented processors from reaching higher frequencies. One answer for keeping pace with Moore's Law is parallel processing. Grid environments provide tools for effective implementation of coarse-grain parallelization. Recently, another kind of hardware has attracted interest: multicore processors. Graphics processing units (GPUs) are a first step towards massively multicore processors. They give everyone some teraflops of cheap computing power in their personal computer. The CUDA library (released in 2007) and the new standard OpenCL (specified in 2008) make programming such devices very convenient. OpenCL is likely to gain wide industrial support and to become a standard of choice for parallel programming. In all cases, the best speedups are obtained by combining precise algorithmic studies with knowledge of the computing architectures. This is especially true of the memory hierarchy: algorithms have to find a good balance between using large (and slow) global memories and fast (but small) local memories. In this chapter, we show how these manycore devices enable more efficient bioinformatics applications. We first give some insights into architectures and parallelism. Then we describe recent implementations specifically designed for manycore architectures, including algorithms for sequence alignment and RNA structure prediction. We conclude with some thoughts about the dissemination of these algorithms and implementations: are they available off the shelf for everyone today?

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

A scenario of mitochondrial genome evolution in maize based on rearrangement events

Author: Darracq Aude
Touzet Pascal
Varré Jean-Stéphane
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Despite their monophyletic origin, animal and plant mitochondrial genomes have been described as exhibiting different modes of evolution. Indeed, plant mitochondrial genomes feature a larger size, a lower mutation rate and more rearrangements than their animal counterparts. Gene order variation in animal mitochondrial genomes is often described as being due to translocation and inversion events, but tandem duplication followed by loss has also been proposed as an alternative process. In plant mitochondrial genomes, at the species level, gene shuffling and duplicate occurrence are such that no clear phylogeny has ever been identified, when considering genome structure variation. Results: In this study we analyzed the whole sequences of eight mitochondrial genomes from maize and teosintes in order to comprehend the events that led to their structural features, i.e. the order of genes, tRNAs, rRNAs, ORFs, pseudogenes and non-coding sequences shared by all mitogenomes and duplicate occurrences. We suggest a tandem duplication model similar to the one described in animals, except that some duplicates can remain. Thi

CiteSeerX

HAL - Lille 3

Crossref

Directory of Open Access Journals

INRIA a CCSD electronic archive server

PubMed Central

Genome dedoubling by DCJ and reversal

Author: Ouangraoua Aïda
Thomas Antoine
Varré Jean-Stéphane
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Segmental duplications in genomes have been studied for many years. Recently, several studies have highlighted a biological phenomenon called breakpoint-duplication that apparently associates a significant proportion of segmental duplications in mammals, and in the Drosophila species group, with breakpoints in rearrangement events. Results: In this paper, we introduce and study a combinatorial problem, inspired by the breakpoint-duplication phenomenon, called the Genome Dedoubling Problem. It consists of finding a minimum-length rearrangement scenario required to transform a genome with duplicated segments into a non-duplicated genome such that duplications are caused by rearrangement breakpoints. We show that the problem, in the Double-Cut-and-Join (DCJ) and the reversal rearrangement models, can be reduced to an APX-complete problem, and we provide algorithms for the Genome Dedoubling Problem with 2-approximable parts. We apply the methods to the reconstruction of a non-duplicated ancestor of Drosophila yakuba. Conclusions: We present the Genome Dedoubling Problem and describe two algorithms solving the problem in the DCJ model and the reversal model. The usefulness of the problem and the methods is shown through an application to real Drosophila data.

HAL - Lille 3

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

INRIA a CCSD electronic archive server

PubMed Central

ProCARs: Progressive Reconstruction of Ancestral Gene Orders

Author: Blanquart Samuel
Ouangraoua Aïda
Perrin Amandine
Varré Jean-Stéphane
Publication venue: HAL CCSD
Publication date: 28/10/2014

Background: In the context of ancestral gene order reconstruction from extant genomes, there exist two main computational approaches: rearrangement-based and homology-based methods. The rearrangement-based methods consist of minimizing a total rearrangement distance on the branches of a species tree. The homology-based methods consist of detecting a set of potential ancestral contiguity features, then assembling these features into Contiguous Ancestral Regions (CARs). Results: In this paper, we present a new homology-based method that uses a progressive approach for both the detection of ancestral contiguity features and their assembly into CARs. The method is based on iteratively detecting a set of potential ancestral adjacencies using the current set of CARs at each step, and constructing CARs progressively using a 2-phase assembly method. We show the usefulness of the method through a reconstruction of the boreoeutherian ancestral gene order, and a comparison with three other homology-based methods: AnGeS, InferCARs and GapAdj. The program, written in Python, and the dataset used in this paper are available at http://bioinfo.lifl.fr/procars/

HAL - Lille 3

Springer - Publisher Connector

INRIA a CCSD electronic archive server

PubMed Central

HAL Descartes

Hal-Diderot

Debugging long-read genome and metagenome assemblies using string graph analysis

Author: Chikhi Rayan
Marijon Pierre
Varré Jean-Stéphane
Publication venue: HAL CCSD
Publication date: 03/07/2017

Third-generation long-read sequencing technologies tackle the repeat problem in genome assembly by producing reads that are long enough to span most repeat instances. In principle, one expects that with such reads most bacterial genomes will be assembled into a single contig. In practice, however, some datasets fail to be perfectly assembled even with leading assemblers, and are fragmented into a handful of contigs. As a means to investigate those cases, we consider the string graphs that are generated by assemblers during intermediate stages of the assembly process. We seek to establish a coherent framework for analyzing these graphs, in the hope that they will help us determine the biological causes that led the assembler to output shorter contigs. This poster presents some preliminary results of such an analysis.

INRIA a CCSD electronic archive server

Biomanycores, open-source parallel code for many-core bioinformatics

Author: Berthelot Jean-Frédéric
Deltel Charles
Giraud Mathieu
Janot Stéphane
Jourdan Laetitia
Lavenier Dominique
Touzet Helene
Varré Jean-Stéphane
Publication venue: HAL CCSD
Publication date: 01/01/2011

Biomanycores is a collection of bioinformatics tools designed to bridge the gap between research in OpenCL/CUDA high-performance computing on GPUs and other "manycore processors" on the one hand, and everyday bioinformaticians and biologists on the other.

HAL-CentraleSupelec

HAL - Lille 3

INRIA a CCSD electronic archive server

HAL-Rennes 1
